Abstract

We show that there is a constant $K > 0$ such that for all $N, s \in \mathbb{N}$,
$s \le N$, the point set consisting of $N$ points chosen uniformly at random
in the $s$-dimensional unit cube $[0,1]^s$ with constant probability admits an
axis-parallel rectangle $[0,x] \subseteq [0,1]^s$ containing $K\sqrt{sN}$ points more than
expected. Consequently, the expected star discrepancy of a random point
set is of order $\sqrt{s/N}$.

1 Introduction 

Discrepancy theory [BS95] deals with different types of uniformity questions,
including balanced colorings of hypergraphs, rounding problems, and balancing games.
A discrepancy topic of particular interest in numerical analysis is the study of
uniformly distributed point sets and sequences [Nie92, DT97]. Such constructions are
the basis of Quasi-Monte Carlo integration: the degree of their uniformity can be
used to derive upper bounds on the integration error.

Among several notions of uniformity, the star discrepancy
seems to be the most common one. Let $N, s \in \mathbb{N}$. Let $P \subseteq [0,1]^s$ with $|P| = N$.
For $x \in [0,1]^s$, let us call the set $[0,x] := \prod_{i=1}^{s} [0,x_i]$ a box (other names in use
are anchored boxes or corners), and denote by $\mathcal{B} := \{[0,x] \mid x \in [0,1]^s\}$ the set of
all these boxes. Denoting the Lebesgue measure of a measurable set $B$ simply by
$\mathrm{vol}(B)$, the star discrepancy of $P$ is defined by

$$D^*(P) := \sup_{B \in \mathcal{B}} \left| \tfrac{1}{N} |P \cap B| - \mathrm{vol}(B) \right|.$$

It is thus a measure of how well $P$ satisfies the aim of being uniformly distributed
with respect to all boxes (in the sense that each box contains a fraction of points 
equal to its share of the volume of the whole unit cube). 
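
For very small point sets, the definition can be evaluated directly. The following is a minimal sketch, not an efficient algorithm: it assumes the standard fact that it suffices to test boxes whose corner coordinates are point coordinates or 1, counting points once with closed and once with open upper boundaries. The function name `star_discrepancy` is illustrative, and the running time grows like $(N+1)^s$.

```python
from itertools import product

def star_discrepancy(points):
    """Brute-force star discrepancy of a small point set in [0,1]^s."""
    N = len(points)
    s = len(points[0])
    # Candidate corner coordinates per dimension: point coordinates and 1.
    grids = [sorted({p[j] for p in points} | {1.0}) for j in range(s)]
    best = 0.0
    for x in product(*grids):
        vol = 1.0
        for xj in x:
            vol *= xj
        # Points in the closed box [0, x] and in the half-open box [0, x).
        closed = sum(all(p[j] <= x[j] for j in range(s)) for p in points)
        opened = sum(all(p[j] < x[j] for j in range(s)) for p in points)
        # Too many points (closed box) or too few points (open box).
        best = max(best, closed / N - vol, vol - opened / N)
    return best

print(star_discrepancy([[0.25], [0.75]]))
print(star_discrepancy([[0.5, 0.5]]))
```

The open-box count captures the supremum over boxes that approach a point from below without containing it.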

The connection to numerical integration is made, among others, by the Koksma-Hlawka
inequality [Kok43, Hla61], which states that the integral $\int_{[0,1]^s} f(x)\,dx$ is
well approximated by the average $\frac{1}{N} \sum_{p \in P} f(p)$ when $P$ has low discrepancy: the
approximation error can be bounded by the product of the star discrepancy of $P$
and a certain variation measure of $f$. This and similar results explain the enormous
attention that low-discrepancy point sets and sequences have attracted.

The classic view on low-discrepancy point sets is to regard the asymptotics
in $N$, the number of points, assuming the dimension $s$ to be fixed. Interestingly,
random point sets are far from having the optimal behavior: their expected
discrepancy is easily seen to be at least of the order of $N^{-1/2}$. A number of deep
results of roughly the last 70 years provide point sets having a discrepancy of order
$(1/N)(\log N)^{s-1}$, see again [Nie92, DT97] or [Mat10].

More recently, it was noted that this discrepancy behavior, and in particular
taking the dimension $s$ as a constant, is not very useful in many practical
applications. If $s$ equals 360, as in some applications in finance, the number
$N$ of points must be prohibitively large to let the term $(1/N)(\log N)^{s-1}$ sink
below the trivial bound of 1. Consequently, Heinrich, Novak, Wasilkowski and
Wozniakowski [HNWW01] started the quest for bounds and constructions that
have a better, in particular polynomial, dependence on $s$. Among other results,
they show that the minimal star discrepancy of an $N$-point set in the $s$-dimensional
unit cube is $O(\sqrt{s/N})$. Interestingly, this bound is already obtained by a random
point set. More precisely, the proof of Theorem 3 in [HNWW01] shows that a
random point set with probability $1 - \exp(-\Theta(c))$ has a discrepancy of at most
$c\sqrt{s/N}$. This also implies that the expected discrepancy of a random point set is
$O(\sqrt{s/N})$. See also Aistleitner [Ais11] for an elementary proof of the minimum
star discrepancy bound, which in addition gives a moderate value for
the constant implicit in the $O(\sqrt{s/N})$ bound. See [AH12] for an explicit
proof that the star discrepancy of a random point set exceeds $c\sqrt{s/N}$ at most with
probability exponential in $-c$.
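
To get a feeling for how large $N$ must be when $s = 360$, the following rough sketch evaluates the two order terms on a logarithmic scale for $N = 10^e$. It ignores all implicit constants, uses the natural logarithm, and the helper names are illustrative only.

```python
import math

def log10_classical_bound(log10_N, s):
    """log10 of (1/N) * (log N)^(s-1) for N = 10**log10_N (natural log)."""
    ln_N = log10_N * math.log(10)
    return (s - 1) * math.log10(ln_N) - log10_N

def log10_random_bound(log10_N, s):
    """log10 of sqrt(s/N) for N = 10**log10_N (constant factor ignored)."""
    return 0.5 * (math.log10(s) - log10_N)

# A positive value means the term is still above the trivial bound of 1.
for e in (3, 100, 1000, 2000):
    print(e,
          round(log10_classical_bound(e, 360), 1),
          round(log10_random_bound(e, 360), 1))
```

Under these simplifications, the term $(1/N)(\log N)^{s-1}$ is still above 1 at $N = 10^{1000}$, whereas $\sqrt{s/N}$ is below 1 as soon as $N > 360$.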

No matching lower bounds for the minimal star discrepancy are known; the
best one is $\Omega(s/N)$ by Hinrichs [Hin04] (of course assuming $s = O(N)$). Closing
this gap is one of the big open problems in this field.

Surprisingly, not even a lower bound for the discrepancy of a random point set
is known. This is the objective of this note, where we give a simple proof that the
upper bounds given in [HNWW01, Ais11] are asymptotically tight.

Theorem 1. There is an absolute constant $K$ such that the following is true. Let
$N, s \in \mathbb{N}$ such that $s \le N$. Let $P$ be a set of $N$ points chosen independently and
uniformly from $[0,1]^s$. Then the expected star discrepancy satisfies
$E[D^*(P)] \ge K\sqrt{s/N}$. The probability that $D^*(P)$ is less than this bound is at most
$\exp(-cs)$, where $c$ is another absolute constant.

To keep this note simple, we did not try to find good explicit values for the 
constants K and c. 

2 Proof 

The proof of Theorem 1 is elementary and only relies on a well-known fact, namely
that binomially distributed random variables with constant probability deviate 
from their expectation (in both directions) by an amount of order the square root 
of the expectation. 

We use this fact as follows. Starting with the box $B = [0,x]$ being the full
box, that is, $x = (1, \dots, 1)$, for $i$ from 1 to $s$ sequentially we reduce $x_i$ from 1
to $1 - 1/s$ if this increases the excess of points in $B$. By the above fact, in each
such iteration with positive probability we gain an excess of $\Omega(\sqrt{N/s})$ points in
$B$, leading to a box having $\Theta(\sqrt{sN})$ points more than it should.

To keep this note self-contained, we first prove a probabilistic statement sat- 
isfying our needs (and not much else). Again, none of the constants have been 
optimized. 

Lemma 2. Let $X$ be a random variable distributed according to a Binomial
distribution with parameters $n \ge 16$ and $1/n \le p \le 1/4$. Then
$\Pr[X \le pn - \sqrt{pn}/2] \ge 3/40$.

Proof. We first prove a slightly stronger statement for $p = 1/2$. By the facts
that $\Pr[X = i] = \Pr[X = n - i]$ is symmetric, maximal for $i = \lfloor n/2 \rfloor$, and
$\Pr[X = \lfloor n/2 \rfloor] \le 1/(4\sqrt{n})$ for all $n \ge 16$ by Stirling's formula (see, e.g., [Rob55]),
we have

$$\begin{aligned}
\Pr[X \le n/2 - (1/2)\sqrt{n/2}] &= \bigl(1 - \Pr[|X - n/2| < (1/2)\sqrt{n/2}]\bigr)/2 \\
&\ge \bigl(1 - (\sqrt{n/2} + 1) \Pr[X = \lfloor n/2 \rfloor]\bigr)/2 \\
&\ge \bigl(1 - (\sqrt{n/2} + 1)(1/(4\sqrt{n}))\bigr)/2 \ge (1 - 1/4)/2 = 3/8.
\end{aligned}$$

To prove the lemma for arbitrary $p$, we use a detour via the Binomial
distribution with parameters $n$ and $2p$. To this aim, let $Y$ be a random subset of
$[n] := \{1, \dots, n\}$ such that for all $i \in [n]$ independently, we have $\Pr[i \in Y] = 2p$.
Let $Z$ be a random subset of $Y$ such that for all $i \in Y$ independently, we have
$\Pr[i \in Z] = 1/2$. Clearly, $z := |Z|$ is distributed according to a Binomial distribution
with parameters $n$ and $p$.
Since $y := |Y|$ is binomially distributed with parameters $n$ and $2/n \le 2p \le 1/2$,
its median is at most $\lceil 2pn \rceil$. Consequently, the probability that $y$ is at most $2pn$
is at least $1/2 - \Pr[y = \lceil 2pn \rceil] \ge 1/2 - 3/10 = 1/5$, since $\Pr[y = \lceil 2pn \rceil]$ is maximal
for $2pn$ attaining the smallest possible value 2 and $n \ge 16$ being equal to 16.

Conditional on the outcome of $y$, and assuming $1 \le y \le 2pn$, we have

$$\Pr[z \le pn - (1/2)\sqrt{pn}] \ge \Pr[z \le y/2 - (1/2)\sqrt{y/2}] \ge 3/8.$$

Here we used that $x \mapsto x/2 - (1/2)\sqrt{x/2}$ is increasing for $x \ge 1/8$. Note that if
$y = 0$, we trivially have $\Pr[z \le pn - (1/2)\sqrt{pn}] = 1 \ge 3/8$. Consequently,

$$\begin{aligned}
\Pr[z \le pn - (1/2)\sqrt{pn}] &\ge \sum_{i=0}^{2pn} \Pr[y = i] \, \Pr[z \le pn - (1/2)\sqrt{pn} \mid y = i] \\
&\ge (3/8) \Pr[y \le 2pn] \ge 3/40.
\end{aligned}$$

□ 
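
The statement of Lemma 2 can also be checked numerically for individual parameter pairs. The sketch below computes the lower-tail probability exactly from the binomial probability mass function; the function name and the chosen test values of $n$ and $p$ are illustrative assumptions, and such a spot check does not replace the proof.

```python
import math

def lemma2_probability(n, p):
    """Exact Pr[X <= p*n - sqrt(p*n)/2] for X ~ Bin(n, p)."""
    threshold = p * n - math.sqrt(p * n) / 2
    k_max = math.floor(threshold)
    # Sum the binomial probability mass function up to the threshold.
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_max + 1))

# Spot checks within the lemma's range n >= 16, 1/n <= p <= 1/4.
for n in (16, 64, 256):
    for p in (1 / n, 0.1, 0.25):
        print(n, round(p, 4), round(lemma2_probability(n, p), 3))
```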

To prove the main result, let us for a given $N$-point set $P$ denote by

$$\mathrm{exc}(B) := |P \cap B| - N \, \mathrm{vol}(B)$$

the excess of points in a measurable set $B$. Hence this is a signed discrepancy
notion without normalization by $N$.

Let $P$ be a set of $N$ random points chosen independently and uniformly in
$[0,1]^s$. Since we do not care about the constant in our main result, we may
assume that $N \ge 64$. For the same reason, for $s < 4$ we may simply regard
any box $B$ of volume 3/4 and invoke Lemma 2 to see that with probability at
least 3/40, its complement contains at least $\sqrt{N/4}/2$ points fewer than expected, implying
$\mathrm{exc}(B) \ge \sqrt{N/4}/2$. Finally, we may assume that $s \le N/4$. If $s$ is larger (but still
at most $N$ as assumed), we may project $P$ onto its first $s' = \lfloor N/4 \rfloor$ coordinates,
apply the result and find a box $B' \subseteq [0,1]^{s'}$ with large excess and note that
$B := B' \times [0,1]^{s-s'}$ is a box in $[0,1]^s$ having the same excess.

Hence we may assume in the following that $N \ge 64$ and $4 \le s \le N/4$.

We will now inductively define numbers $x_1, \dots, x_s \in \{1 - 1/s, 1\}$ such that
$B_i := \prod_{j=1}^{i} [0,x_j] \times [0,1]^{s-i}$ roughly has an excess of $i\sqrt{N/s}$ points and surely has
a nonnegative excess for $i = 0, \dots, s$. Note that any choice of $x_1, \dots, x_s$ gives
$\mathrm{vol}(B_i) \ge (1 - 1/s)^s \ge 1/4$.

Note that $B_0 = [0,1]^s$ is already defined and has an excess of zero. Assume
that for some $0 \le i < s$ we have fixed $x_1, \dots, x_i$, and consequently $B_0, \dots, B_i$,
and that $B_i$ has a nonnegative excess. Note that given all this, $B_i$ contains
$N_i := |B_i \cap P| \ge N \, \mathrm{vol}(B_i) \ge N/4$ points, all of whose $(i+1)$-st to $s$-th coordinates are
independently and uniformly distributed in $[0,1]$.
Consequently, the rectangle $C_{i+1} := \prod_{j=1}^{i} [0,x_j] \times [1 - 1/s, 1] \times [0,1]^{s-i-1} \subseteq B_i$
contains each of these $N_i$ points with probability $1/s$. By Lemma 2, with
probability at least 3/40, the rectangle $C_{i+1}$ contains at least $\sqrt{N_i/s}/2$ points
fewer than the expected value $N_i/s$. In this case, put $x_{i+1} := 1 - 1/s$, implying
$B_{i+1} = B_i \setminus C_{i+1}$; otherwise put $x_{i+1} := 1$, implying $B_{i+1} = B_i$. In the first case,
the excess of $B_{i+1}$ satisfies

$$\begin{aligned}
\mathrm{exc}(B_{i+1}) &= |P \cap B_{i+1}| - N \, \mathrm{vol}(B_{i+1}) \\
&= |P \cap B_i| - |P \cap C_{i+1}| - N(1 - 1/s) \, \mathrm{vol}(B_i) \\
&\ge N_i - (N_i/s - \sqrt{N_i/s}/2) - N(1 - 1/s) \, \mathrm{vol}(B_i) \\
&= (1 - 1/s) \, \mathrm{exc}(B_i) + \sqrt{N_i/s}/2. \qquad (1)
\end{aligned}$$

In the second case, $\mathrm{exc}(B_i)$ and $\mathrm{exc}(B_{i+1})$ are trivially equal. In both cases, $B_{i+1}$
has a nonnegative excess.

Assume all $x_i$ and $B_i$ constructed in this fashion. Let $k$ be the number of $x_i$
which are equal to $1 - 1/s$. Then by repeated application of (1), the excess of $B_s$
is at least $k(1 - 1/s)^k \sqrt{N/(4s)}/2 \ge k\sqrt{N/s}/16$. Since each $x_i$ independently with
probability at least 3/40 is $1 - 1/s$, we have $E(k) \ge 3s/40$. Consequently, the
expected excess of $B_s$ is at least $(3/640)\sqrt{sN}$. Also, by a simple Chernoff bound
argument, the probability that $k$ is less than $\tfrac{1}{2} \cdot 3s/40$ is less than $\exp(-3s/320)$.

This concludes the proof. 
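
The construction used in the proof is easy to simulate. The following sketch applies the shrinking rule to one random point set and reports the number $k$ of shrunk coordinates and the excess of the final box. The function name `greedy_box`, the parameters $N = 400$, $s = 8$ and the fixed seed are illustrative assumptions, not part of the proof.

```python
import math
import random

def greedy_box(points, s):
    """Build the box B_s by the shrinking rule of the proof.

    Returns (x, excess, k): the corner x, exc(B_s), and the number k of
    coordinates that were reduced from 1 to 1 - 1/s.
    """
    N = len(points)
    cut = 1.0 - 1.0 / s
    inside = list(points)      # points currently in B_i
    x = [1.0] * s
    vol = 1.0                  # vol(B_i)
    k = 0
    for i in range(s):
        n_i = len(inside)
        in_slab = sum(1 for p in inside if p[i] > cut)  # points in C_{i+1}
        # Shrink only if the slab holds at least sqrt(N_i/s)/2 fewer
        # points than its expectation N_i/s.
        if in_slab <= n_i / s - math.sqrt(n_i / s) / 2:
            x[i] = cut
            vol *= cut
            k += 1
            inside = [p for p in inside if p[i] <= cut]
    excess = len(inside) - N * vol
    return x, excess, k

rng = random.Random(0)   # fixed seed, chosen only for reproducibility
N, s = 400, 8
pts = [[rng.random() for _ in range(s)] for _ in range(N)]
x, excess, k = greedy_box(pts, s)
print(k, excess, excess / math.sqrt(s * N))
```

By inequality (1), the excess returned here is nonnegative for every input, and it is at least $k(1 - 1/s)^k \sqrt{N/(4s)}/2$ whenever $s \ge 2$; only the value of $k$ depends on the random points.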